NSF PAR Search | NSF Public Access Repository

Hermes: Algorithm-System Co-design for Efficient Retrieval-Augmented Generation At-Scale

https://doi.org/10.1145/3695053.3731076

Shen, Michael; Umar, Muhammad; Maeng, Kiwan; Suh, G Edward; Gupta, Udit (June 2025, ACM)

Free, publicly-accessible full text available June 20, 2026
Practical Federated Recommendation Model Learning Using ORAM with Controlled Privacy

https://doi.org/10.1145/3676641.3716014

Liu, Jinyu; Xiong, Wenjie; Suh, G Edward; Maeng, Kiwan (March 2025, ACM)

Training high-quality recommendation models requires collecting sensitive user data. The popular privacy-enhancing training method, federated learning (FL), cannot be used practically due to these models’ large embedding tables. This paper introduces FEDORA, a system for training recommendation models with FL. FEDORA allows each user to only download, train, and upload a small subset of the large tables based on their private data, while hiding the access pattern using oblivious memory (ORAM). FEDORA reduces the ORAM’s prohibitive latency and memory overheads by (1) introducing 𝜖-FDP, a formal way to balance the ORAM’s privacy with performance, and (2) placing the large ORAM in a power- and cost-efficient SSD with SSD-friendly optimizations. Additionally, FEDORA is carefully designed to support (3) modern operation modes of FL. FEDORA achieves high model accuracy by using private features during training while achieving, on average, 5× latency and 158× SSD lifetime improvement over the baseline.
Free, publicly-accessible full text available March 30, 2026
Efficient Memory Side-Channel Protection for Embedding Generation in Machine Learning

https://doi.org/10.1109/HPCA61900.2025.00041

Umar, Muhammad; Marathe, Akhilesh Parag; Gupta, Monami Dutta; Ghosh, Shubham Jogprakash; Suh, G Edward; Xiong, Wenjie (March 2025, IEEE)

Free, publicly-accessible full text available March 1, 2026
Efficient Privacy-Preserving Machine Learning with Lightweight Trusted Hardware

https://doi.org/10.56553/popets-2024-0119

Huang, Pengzhi; Hoang, Thang; Li, Yueying; Shi, Elaine; Suh, G Edward (October 2024, Proceedings on Privacy Enhancing Technologies)

In this paper, we propose a new secure machine learning inference platform assisted by a small dedicated security processor, which will be easier to protect and deploy compared to today's TEEs integrated into high-performance processors. Our platform provides three main advantages over the state-of-the-art: (i) We achieve significant performance improvements compared to state-of-the-art distributed Privacy-Preserving Machine Learning (PPML) protocols, with only a small security processor that is comparable to a discrete security chip such as the Trusted Platform Module (TPM) or on-chip security subsystems in SoCs similar to the Apple enclave processor. In the semi-honest setting with WAN/GPU, our scheme is 4X-63X faster than Falcon (PoPETs'21) and AriaNN (PoPETs'22) and 3.8X-12X more communication efficient. We achieve even higher performance improvements in the malicious setting. (ii) Our platform guarantees security with abort against malicious adversaries under an honest-majority assumption. (iii) Our technique is not limited by the size of secure memory in a TEE and can support high-capacity modern neural networks like ResNet18 and Transformer. While previous work investigated the use of high-performance TEEs in PPML, this work is the first to show that even tiny secure hardware with very limited performance can be leveraged to significantly speed up distributed PPML protocols if the protocol is carefully designed for lightweight trusted hardware.
Full Text Available
Pangenome-Informed Language Models for Synthetic Genome Sequence Generation

https://doi.org/10.1101/2024.09.18.612131

Huang, Pengzhi; Charton, François; Schmelzle, Jan-Niklas M; Darnell, Shelby S; Prins, Pjotr; Garrison, Erik; Suh, G Edward (September 2024, bioRxiv)

Language Models (LM) have been extensively utilized for learning DNA sequence patterns and generating synthetic sequences. In this paper, we present a novel approach for the generation of synthetic DNA data using pangenomes in combination with LM. We introduce three innovative pangenome-based tokenization schemes, including two that can decouple from private data, while enhancing long DNA sequence generation. Our experimental results demonstrate the superiority of pangenome-based tokenization over classical methods in generating high-utility synthetic DNA sequences, highlighting a promising direction for the public sharing of genomic datasets.
Full Text Available
Strong Asymptotic Composition Theorems for Mutual Information Measures

https://doi.org/10.1109/TIT.2022.3228519

Wu, Benjamin; Wagner, Aaron B; Issa, Ibrahim; Suh, G Edward (May 2024, IEEE Transactions on Information Theory)

Full Text Available
LibPreemptible: Enabling Fast, Adaptive, and Hardware-Assisted User-Space Scheduling

https://doi.org/10.1109/HPCA57654.2024.00075

Li, Yueying; Lazarev, Nikita; Koufaty, David; Yin, Tenny; Anderson, Andy; Zhang, Zhiru; Suh, G Edward; Kaffes, Kostis; Delimitrou, Christina (March 2024, International Symposium on High Performance Computer Architecture)

Full Text Available
Creating a biomedical knowledge base by addressing GPT inaccurate responses and benchmarking context

https://doi.org/10.1101/2024.10.16.618663

Darnell, S Solomon; Overall, Rupert W; Guarracino, Andrea; Colonna, Vicenza; Villani, Flavia; Garrison, Erik; Isaac, Arun; Muli, Priscilla; Muriithi, Frederick Muriuki; Kabui, Alexander; et al (October 2024, bioRxiv)

We created GNQA, a generative pre-trained transformer (GPT) knowledge base driven by a performant retrieval augmented generation (RAG) with a focus on aging, dementia, Alzheimer’s and diabetes. We uploaded a corpus of three thousand peer reviewed publications on these topics into the RAG. To address concerns about inaccurate responses and GPT ‘hallucinations’, we implemented a context provenance tracking mechanism that enables researchers to validate responses against the original material and to get references to the original papers. To assess the effectiveness of contextual information we collected evaluations and feedback from both domain expert users and ‘citizen scientists’ on the relevance of GPT responses. A key innovation of our study is automated evaluation by way of a RAG assessment system (RAGAS). RAGAS combines human expert assessment with AI-driven evaluation to measure the effectiveness of RAG systems. When evaluating the responses to their questions, human respondents give a “thumbs-up” 76% of the time. Meanwhile, RAGAS scores 90% on answer relevance on questions posed by experts. When GPT generates the questions, RAGAS scores 74% on answer relevance. With RAGAS we created a benchmark that can be used to continuously assess the performance of our knowledge base. Full GNQA functionality is embedded in the free GeneNetwork.org web service, an open-source system containing over 25 years of experimental data on model organisms and humans. The code developed for this study is published under a free and open-source software license at https://git.genenetwork.org/gn-ai/tree/README.md.
Full Text Available
MGX: Near-zero Overhead Memory Protection for Data-intensive Accelerators

https://doi.org/10.1145/3470496.3527418

Hua, Weizhe; Umar, Muhammad; Zhang, Zhiru; Suh, G. Edward (June 2022, Proceedings of the 49th Annual International Symposium on Computer Architecture)

Full Text Available
SoftVN: efficient memory protection via software-provided version numbers

https://doi.org/10.1145/3470496.3527378

Umar, Muhammad; Hua, Weizhe; Zhang, Zhiru; Suh, G. Edward (June 2022, International Symposium on Computer Architecture)

Full Text Available
